Overview

Dataset statistics

Number of variables: 17
Number of observations: 19001
Missing cells: 7531
Missing cells (%): 2.3%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 2.5 MiB
Average record size in memory: 136.0 B
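
The percentage of missing cells follows from the counts above: missing cells divided by rows times columns. A minimal sketch of that arithmetic is shown here; the function name is illustrative, and the 8-bytes-per-value figure is an assumption that happens to match the reported record size, not something the report states.

```python
def missing_cells_pct(missing_cells: int, n_rows: int, n_cols: int) -> float:
    """Share of empty cells in an n_rows x n_cols table, in percent."""
    return 100.0 * missing_cells / (n_rows * n_cols)


# Figures taken from the dataset statistics above.
n_vars, n_obs, missing = 17, 19001, 7531
print(f"{missing_cells_pct(missing, n_obs, n_vars):.1f}%")  # 2.3%

# Assumption: every column is stored as one 8-byte value per row.
record_bytes = n_vars * 8
print(record_bytes)  # 136
```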

Variable types

Numeric: 11
Categorical: 5
DateTime: 1

Warnings

name has a high cardinality: 18780 distinct values (High cardinality)
host_name has a high cardinality: 6307 distinct values (High cardinality)
neighbourhood has a high cardinality: 215 distinct values (High cardinality)
last_review has 3758 (19.8%) missing values (Missing)
reviews_per_month has 3758 (19.8%) missing values (Missing)
df_index is uniformly distributed (Uniform)
name is uniformly distributed (Uniform)
df_index has unique values (Unique)
id has unique values (Unique)
minimum_nights has 5003 (26.3%) zeros (Zeros)
number_of_reviews has 3758 (19.8%) zeros (Zeros)
availability_365 has 6970 (36.7%) zeros (Zeros)
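
The percentages in these warnings are plain shares of the 19001 observations. The sketch below recomputes three of them; the helper name and the dictionary are illustrative and not part of pandas-profiling.

```python
N_OBS = 19001  # number of observations from the overview


def share(count: int, total: int = N_OBS) -> str:
    """Format a count as a percentage of all rows, one decimal place."""
    return f"{100.0 * count / total:.1f}%"


# Counts copied from the warnings above.
flagged = {
    "last_review missing": 3758,
    "minimum_nights zeros": 5003,
    "availability_365 zeros": 6970,
}
for label, count in flagged.items():
    print(label, share(count))
```

Note that the same count, 3758, appears for missing last_review, missing reviews_per_month, and zero number_of_reviews, which suggests these are the same listings.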

Reproduction

Analysis started: 2023-05-06 15:32:08.225078
Analysis finished: 2023-05-06 15:32:35.878003
Duration: 27.65 seconds
Software version: pandas-profiling v2.11.0
Download configuration: config.yaml
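
The reported duration can be checked against the two timestamps with the standard library. This is only a consistency check on the figures above, not a description of how the report computes it.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2023-05-06 15:32:08.225078", FMT)
finished = datetime.strptime("2023-05-06 15:32:35.878003", FMT)

# Difference between the two timestamps, in seconds.
duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")  # 27.65 seconds
```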

Variables

df_index
Real number (ℝ≥0)

UNIFORM
UNIQUE

Distinct: 19001
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 10003.44161
Minimum: 0
Maximum: 19999
Zeros: 1
Zeros (%): < 0.1%
Memory size: 148.6 KiB

Quantile statistics

Minimum: 0
5th percentile: 1011
Q1: 5013
Median: 10005
Q3: 14995
95th percentile: 19003
Maximum: 19999
Range: 19999
Interquartile range (IQR): 9982

Descriptive statistics

Standard deviation: 5769.642118
Coefficient of variation (CV): 0.5767657116
Kurtosis: -1.197212199
Mean: 10003.44161
Median Absolute Deviation (MAD): 4991
Skewness: -0.0006857784065
Sum: 190075394
Variance: 33288770.16
Monotonicity: Strictly increasing
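
Two of the descriptive statistics are derived from the others: the coefficient of variation is the standard deviation divided by the mean, and the variance is the squared standard deviation. A small sketch that checks the df_index figures under those standard definitions (the function name is illustrative):

```python
def coefficient_of_variation(std: float, mean: float) -> float:
    """Standard deviation expressed as a fraction of the mean."""
    return std / mean


# Values reported for df_index above.
std, mean = 5769.642118, 10003.44161

cv = coefficient_of_variation(std, mean)
variance = std ** 2

print(round(cv, 4))     # compare with the reported CV of 0.5767657116
print(round(variance))  # compare with the reported variance of 33288770.16
```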
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 1 | < 0.1%
17714 | 1 | < 0.1%
9518 | 1 | < 0.1%
15661 | 1 | < 0.1%
13612 | 1 | < 0.1%
3371 | 1 | < 0.1%
1322 | 1 | < 0.1%
7465 | 1 | < 0.1%
5416 | 1 | < 0.1%
19747 | 1 | < 0.1%
Other values (18991) | 18991 | 99.9%

Value | Count | Frequency (%)
0 | 1 | < 0.1%
1 | 1 | < 0.1%
2 | 1 | < 0.1%
3 | 1 | < 0.1%
4 | 1 | < 0.1%

Value | Count | Frequency (%)
19999 | 1 | < 0.1%
19998 | 1 | < 0.1%
19997 | 1 | < 0.1%
19996 | 1 | < 0.1%
19995 | 1 | < 0.1%

id
Real number (ℝ≥0)

UNIQUE

Distinct: 19001
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 18830405.22
Minimum: 2539
Maximum: 36485609
Zeros: 0
Zeros (%): 0.0%
Memory size: 148.6 KiB

Quantile statistics

Minimum: 2539
5th percentile: 1177971
Q1: 9355498
Median: 19387536
Q3: 28919516
95th percentile: 35240782
Maximum: 36485609
Range: 36483070
Interquartile range (IQR): 19564018

Descriptive statistics

Standard deviation: 10969857.88
Coefficient of variation (CV): 0.5825609034
Kurtosis: -1.225663309
Mean: 18830405.22
Median Absolute Deviation (MAD): 9804926
Skewness: -0.06687371468
Sum: 3.577965297 × 10^11
Variance: 1.203377819 × 10^14
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
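
A histogram with fixed-size bins splits the value range into equally wide intervals and counts the values in each. The sketch below illustrates the idea with the standard library only; it is not the plotting code used by the report, and the function name and toy input are assumptions.

```python
def fixed_width_histogram(values, bins=50):
    """Count values in `bins` equally wide intervals spanning min..max."""
    lo, hi = min(values), max(values)
    counts = [0] * bins
    if hi == lo:
        # Degenerate case: all values identical, put them in the first bin.
        counts[0] = len(values)
        return counts, [lo] * (bins + 1)
    width = (hi - lo) / bins
    for v in values:
        # The maximum lies on the right edge, so clamp it into the last bin.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    edges = [lo + i * width for i in range(bins + 1)]
    return counts, edges


# Toy input, not the real id column.
counts, edges = fixed_width_histogram(list(range(11)), bins=5)
print(counts)  # [2, 2, 2, 2, 3]
```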
Value | Count | Frequency (%)
32493112 | 1 | < 0.1%
36308266 | 1 | < 0.1%
15509545 | 1 | < 0.1%
13073701 | 1 | < 0.1%
20775370 | 1 | < 0.1%
2728293 | 1 | < 0.1%
169152 | 1 | < 0.1%
3251638 | 1 | < 0.1%
21308702 | 1 | < 0.1%
8162587 | 1 | < 0.1%
Other values (18991) | 18991 | 99.9%

Value | Count | Frequency (%)
2539 | 1 | < 0.1%
3831 | 1 | < 0.1%
5022 | 1 | < 0.1%
5121 | 1 | < 0.1%
5203 | 1 | < 0.1%

Value | Count | Frequency (%)
36485609 | 1 | < 0.1%
36485057 | 1 | < 0.1%
36480292 | 1 | < 0.1%
36479723 | 1 | < 0.1%
36478343 | 1 | < 0.1%

name
Categorical

HIGH CARDINALITY
UNIFORM

Distinct: 18780
Distinct (%): 98.9%
Missing: 7
Missing (%): < 0.1%
Memory size: 148.6 KiB
Hillside Hotel: 7
Brooklyn Apartment: 7
Home away from home: 6
Private Room: 6
New york Multi-unit building: 5
Other values (18775): 18963

Length

Max length: 179
Median length: 36
Mean length: 36.75139518
Min length: 1

Characters and Unicode

Total characters: 698056
Distinct characters: 504
Distinct categories: 20
Distinct scripts: 11
Distinct blocks: 17
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 18619
Unique (%): 98.0%

Sample

1st row: Private Lg Room 15 min to Manhattan
2nd row: TIME SQUARE CHARMING ONE BED IN HELL'S KITCHEN,NYC
3rd row: Voted #1 Location Quintessential 1BR W Village Apt
4th row: Spacious 1 bedroom apartment 15min from Manhattan
5th row: Big beautiful bedroom in huge Bushwick apartment
Value | Count | Frequency (%)
Hillside Hotel | 7 | < 0.1%
Brooklyn Apartment | 7 | < 0.1%
Home away from home | 6 | < 0.1%
Private Room | 6 | < 0.1%
New york Multi-unit building | 5 | < 0.1%
Private room in Manhattan | 5 | < 0.1%
Cozy Room | 5 | < 0.1%
Private room in Williamsburg | 4 | < 0.1%
Cozy Private Room | 4 | < 0.1%
Cozy home away from home | 3 | < 0.1%
Other values (18770) | 18942 | 99.7%
(Missing) | 7 | < 0.1%
Histogram of lengths of the category
ValueCountFrequency (%)
in6594
 
5.7%
room4014
 
3.5%
3161
 
2.7%
bedroom3016
 
2.6%
private2874
 
2.5%
apartment2658
 
2.3%
cozy2017
 
1.7%
apt1765
 
1.5%
brooklyn1623
 
1.4%
studio1556
 
1.3%
Other values (6673)86285
74.7%
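
The table above counts lowercased words across all listing names. A minimal sketch of that kind of count, using the five sample rows shown earlier, could look as follows. Splitting on whitespace is an assumption; the report's actual tokenizer may treat punctuation differently.

```python
from collections import Counter

# The five sample rows listed for the name variable.
sample_names = [
    "Private Lg Room 15 min to Manhattan",
    "TIME SQUARE CHARMING ONE BED IN HELL'S KITCHEN,NYC",
    "Voted #1 Location Quintessential 1BR W Village Apt",
    "Spacious 1 bedroom apartment 15min from Manhattan",
    "Big beautiful bedroom in huge Bushwick apartment",
]


def word_counts(texts):
    """Count lowercased, whitespace-separated tokens over all texts."""
    counter = Counter()
    for text in texts:
        counter.update(text.lower().split())
    return counter


counts = word_counts(sample_names)
print(counts["in"], counts["bedroom"], counts["manhattan"])  # 2 2 2
```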

Most occurring characters

ValueCountFrequency (%)
97219
 
13.9%
e48065
 
6.9%
o47895
 
6.9%
t40878
 
5.9%
a40431
 
5.8%
r38293
 
5.5%
i36980
 
5.3%
n36723
 
5.3%
l20035
 
2.9%
m19434
 
2.8%
Other values (494)272103
39.0%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 469006 | 67.2%
Uppercase Letter | 102950 | 14.7%
Space Separator | 97221 | 13.9%
Other Punctuation | 12924 | 1.9%
Decimal Number | 9521 | 1.4%
Dash Punctuation | 2620 | 0.4%
Other Letter | 976 | 0.1%
Math Symbol | 970 | 0.1%
Close Punctuation | 621 | 0.1%
Open Punctuation | 568 | 0.1%
Other values (10) | 679 | 0.1%
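
The category names in this table correspond to Unicode general categories. Python's standard `unicodedata` module exposes the two-letter codes, so a rough equivalent of the per-category count can be sketched as below. The mapping table is a hand-written subset and the function name is illustrative; this is not the report's own implementation.

```python
import unicodedata
from collections import Counter

# Hand-written subset of Unicode general category codes and their long names.
CATEGORY_NAMES = {
    "Ll": "Lowercase Letter",
    "Lu": "Uppercase Letter",
    "Zs": "Space Separator",
    "Po": "Other Punctuation",
    "Nd": "Decimal Number",
    "Pd": "Dash Punctuation",
    "Lo": "Other Letter",
    "Sm": "Math Symbol",
    "Pe": "Close Punctuation",
    "Ps": "Open Punctuation",
}


def category_counts(text):
    """Count characters per Unicode general category."""
    codes = (unicodedata.category(ch) for ch in text)
    # Fall back to the raw two-letter code for categories not in the subset.
    return Counter(CATEGORY_NAMES.get(code, code) for code in codes)


counts = category_counts("Apt #1")
print(counts)
```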

Most frequent character per category

ValueCountFrequency (%)
36
 
3.7%
22
 
2.3%
21
 
2.2%
19
 
1.9%
18
 
1.8%
18
 
1.8%
16
 
1.6%
15
 
1.5%
15
 
1.5%
13
 
1.3%
Other values (299)783
80.2%
ValueCountFrequency (%)
e48065
 
10.2%
o47895
 
10.2%
t40878
 
8.7%
a40431
 
8.6%
r38293
 
8.2%
i36980
 
7.9%
n36723
 
7.8%
l20035
 
4.3%
m19434
 
4.1%
s18729
 
4.0%
Other values (48)121543
25.9%
ValueCountFrequency (%)
95
28.5%
66
19.8%
31
 
9.3%
19
 
5.7%
14
 
4.2%
13
 
3.9%
13
 
3.9%
8
 
2.4%
7
 
2.1%
6
 
1.8%
Other values (27)61
18.3%
ValueCountFrequency (%)
B11484
 
11.2%
S10080
 
9.8%
C8176
 
7.9%
A7545
 
7.3%
R6917
 
6.7%
P5740
 
5.6%
E5345
 
5.2%
L5288
 
5.1%
M4520
 
4.4%
N4373
 
4.2%
Other values (23)33482
32.5%
ValueCountFrequency (%)
,3504
27.1%
!3055
23.6%
/1958
15.2%
.1683
13.0%
&1234
 
9.5%
'425
 
3.3%
*329
 
2.5%
:225
 
1.7%
#212
 
1.6%
"111
 
0.9%
Other values (9)188
 
1.5%
ValueCountFrequency (%)
13413
35.8%
22526
26.5%
3931
 
9.8%
5800
 
8.4%
0768
 
8.1%
4459
 
4.8%
6233
 
2.4%
8150
 
1.6%
7147
 
1.5%
994
 
1.0%
ValueCountFrequency (%)
+476
49.1%
|359
37.0%
~99
 
10.2%
=11
 
1.1%
>9
 
0.9%
<7
 
0.7%
6
 
0.6%
2
 
0.2%
×1
 
0.1%
ValueCountFrequency (%)
(544
95.8%
[16
 
2.8%
{4
 
0.7%
3
 
0.5%
1
 
0.2%
ValueCountFrequency (%)
)596
96.0%
]17
 
2.7%
}4
 
0.6%
3
 
0.5%
1
 
0.2%
ValueCountFrequency (%)
-2589
98.8%
16
 
0.6%
15
 
0.6%
ValueCountFrequency (%)
97219
> 99.9%
 2
 
< 0.1%
ValueCountFrequency (%)
89
86.4%
14
 
13.6%
ValueCountFrequency (%)
66
88.0%
9
 
12.0%
ValueCountFrequency (%)
15
78.9%
4
 
21.1%
ValueCountFrequency (%)
7
70.0%
3
30.0%
ValueCountFrequency (%)
^6
85.7%
`1
 
14.3%
ValueCountFrequency (%)
72
100.0%
ValueCountFrequency (%)
$34
100.0%
ValueCountFrequency (%)
_20
100.0%
ValueCountFrequency (%)
²6
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 571873 | 81.9%
Common | 125049 | 17.9%
Han | 878 | 0.1%
Inherited | 75 | < 0.1%
Cyrillic | 71 | < 0.1%
Katakana | 45 | < 0.1%
Hebrew | 31 | < 0.1%
Georgian | 12 | < 0.1%
Hiragana | 12 | < 0.1%
Hangul | 8 | < 0.1%

Most frequent character per script

ValueCountFrequency (%)
36
 
4.1%
22
 
2.5%
21
 
2.4%
19
 
2.2%
18
 
2.1%
18
 
2.1%
16
 
1.8%
15
 
1.7%
15
 
1.7%
13
 
1.5%
Other values (244)685
78.0%
ValueCountFrequency (%)
97219
77.7%
,3504
 
2.8%
13413
 
2.7%
!3055
 
2.4%
-2589
 
2.1%
22526
 
2.0%
/1958
 
1.6%
.1683
 
1.3%
&1234
 
1.0%
3931
 
0.7%
Other values (92)6937
 
5.5%
ValueCountFrequency (%)
e48065
 
8.4%
o47895
 
8.4%
t40878
 
7.1%
a40431
 
7.1%
r38293
 
6.7%
i36980
 
6.5%
n36723
 
6.4%
l20035
 
3.5%
m19434
 
3.4%
s18729
 
3.3%
Other values (61)224410
39.2%
ValueCountFrequency (%)
7
15.6%
5
 
11.1%
3
 
6.7%
3
 
6.7%
3
 
6.7%
3
 
6.7%
2
 
4.4%
2
 
4.4%
1
 
2.2%
1
 
2.2%
Other values (15)15
33.3%
ValueCountFrequency (%)
а10
14.1%
н7
9.9%
т7
9.9%
о6
 
8.5%
с6
 
8.5%
к5
 
7.0%
м4
 
5.6%
е4
 
5.6%
я3
 
4.2%
р3
 
4.2%
Other values (9)16
22.5%
ValueCountFrequency (%)
י5
16.1%
ו5
16.1%
ב4
12.9%
ר4
12.9%
ע2
 
6.5%
ת2
 
6.5%
ה2
 
6.5%
ד1
 
3.2%
ש1
 
3.2%
ל1
 
3.2%
Other values (4)4
12.9%
ValueCountFrequency (%)
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
ValueCountFrequency (%)
5
41.7%
2
 
16.7%
1
 
8.3%
1
 
8.3%
1
 
8.3%
1
 
8.3%
1
 
8.3%
ValueCountFrequency (%)
66
88.0%
9
 
12.0%
ValueCountFrequency (%)
2
100.0%
ValueCountFrequency (%)
12
100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 696288 | 99.7%
CJK | 878 | 0.1%
Punctuation | 187 | < 0.1%
None | 179 | < 0.1%
Misc Symbols | 176 | < 0.1%
Dingbats | 130 | < 0.1%
VS | 75 | < 0.1%
Cyrillic | 71 | < 0.1%
Hebrew | 31 | < 0.1%
Georgian | 12 | < 0.1%
Other values (7) | 29 | < 0.1%

Most frequent character per block

ValueCountFrequency (%)
97219
 
14.0%
e48065
 
6.9%
o47895
 
6.9%
t40878
 
5.9%
a40431
 
5.8%
r38293
 
5.5%
i36980
 
5.3%
n36723
 
5.3%
l20035
 
2.9%
m19434
 
2.8%
Other values (86)270335
38.8%
ValueCountFrequency (%)
66
50.8%
13
 
10.0%
13
 
10.0%
8
 
6.2%
7
 
5.4%
5
 
3.8%
4
 
3.1%
4
 
3.1%
3
 
2.3%
2
 
1.5%
Other values (3)5
 
3.8%
ValueCountFrequency (%)
36
 
4.1%
22
 
2.5%
21
 
2.4%
19
 
2.2%
18
 
2.1%
18
 
2.1%
16
 
1.8%
15
 
1.7%
15
 
1.7%
13
 
1.5%
Other values (244)685
78.0%
ValueCountFrequency (%)
89
47.6%
34
 
18.2%
16
 
8.6%
15
 
8.0%
15
 
8.0%
14
 
7.5%
4
 
2.1%
ValueCountFrequency (%)
95
54.0%
31
 
17.6%
14
 
8.0%
6
 
3.4%
5
 
2.8%
4
 
2.3%
3
 
1.7%
3
 
1.7%
2
 
1.1%
2
 
1.1%
Other values (6)11
 
6.2%
ValueCountFrequency (%)
19
 
10.6%
à15
 
8.4%
ó9
 
5.0%
7
 
3.9%
·7
 
3.9%
é7
 
3.9%
7
 
3.9%
7
 
3.9%
6
 
3.4%
²6
 
3.4%
Other values (51)89
49.7%
ValueCountFrequency (%)
2
100.0%
ValueCountFrequency (%)
66
88.0%
9
 
12.0%
ValueCountFrequency (%)
1
100.0%
ValueCountFrequency (%)
2
100.0%
ValueCountFrequency (%)
12
100.0%
ValueCountFrequency (%)
а10
14.1%
н7
9.9%
т7
9.9%
о6
 
8.5%
с6
 
8.5%
к5
 
7.0%
м4
 
5.6%
е4
 
5.6%
я3
 
4.2%
р3
 
4.2%
Other values (9)16
22.5%
ValueCountFrequency (%)
2
100.0%
ValueCountFrequency (%)
5
41.7%
2
 
16.7%
1
 
8.3%
1
 
8.3%
1
 
8.3%
1
 
8.3%
1
 
8.3%
ValueCountFrequency (%)
י5
16.1%
ו5
16.1%
ב4
12.9%
ר4
12.9%
ע2
 
6.5%
ת2
 
6.5%
ה2
 
6.5%
ד1
 
3.2%
ש1
 
3.2%
ל1
 
3.2%
Other values (4)4
12.9%
ValueCountFrequency (%)
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
1
12.5%
ValueCountFrequency (%)
1
50.0%
1
50.0%

host_id
Real number (ℝ≥0)

Distinct: 16241
Distinct (%): 85.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 66394588.99
Minimum: 2571
Maximum: 274273284
Zeros: 0
Zeros (%): 0.0%
Memory size: 148.6 KiB

Quantile statistics

Minimum: 2571
5th percentile: 764688
Q1: 7728754
Median: 30487854
Q3: 104835356
95th percentile: 239519185
Maximum: 274273284
Range: 274270713
Interquartile range (IQR): 97106602

Descriptive statistics

Standard deviation: 77826632.15
Coefficient of variation (CV): 1.172183356
Kurtosis: 0.2898665825
Mean: 66394588.99
Median Absolute Deviation (MAD): 27185317
Skewness: 1.245958822
Sum: 1.261563585 × 10^12
Variance: 6.056984672 × 10^15
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
219517861 | 117 | 0.6%
107434423 | 71 | 0.4%
30283594 | 44 | 0.2%
137358866 | 36 | 0.2%
12243051 | 36 | 0.2%
61391963 | 34 | 0.2%
16098958 | 33 | 0.2%
22541573 | 32 | 0.2%
26377263 | 24 | 0.1%
2856748 | 22 | 0.1%
Other values (16231) | 18552 | 97.6%

Value | Count | Frequency (%)
2571 | 1 | < 0.1%
2787 | 3 | < 0.1%
3151 | 1 | < 0.1%
3415 | 1 | < 0.1%
3563 | 1 | < 0.1%

Value | Count | Frequency (%)
274273284 | 1 | < 0.1%
274195458 | 1 | < 0.1%
274103383 | 1 | < 0.1%
273870123 | 1 | < 0.1%
273841667 | 1 | < 0.1%

host_name
Categorical

HIGH CARDINALITY

Distinct: 6307
Distinct (%): 33.2%
Missing: 8
Missing (%): < 0.1%
Memory size: 148.6 KiB
Michael: 159
David: 157
John: 130
Sonder (NYC): 117
Alex: 98
Other values (6302): 18332

Length

Max length: 35
Median length: 6
Mean length: 6.100931922
Min length: 1

Characters and Unicode

Total characters: 115875
Distinct characters: 138
Distinct categories: 11
Distinct scripts: 6
Distinct blocks: 7
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 4077
Unique (%): 21.5%

Sample

1st row: Iris
2nd row: Johlex
3rd row: John
4th row: Regan
5th row: Megan
Value | Count | Frequency (%)
Michael | 159 | 0.8%
David | 157 | 0.8%
John | 130 | 0.7%
Sonder (NYC) | 117 | 0.6%
Alex | 98 | 0.5%
Daniel | 92 | 0.5%
Sarah | 87 | 0.5%
Maria | 86 | 0.5%
Chris | 81 | 0.4%
Anna | 77 | 0.4%
Other values (6297) | 17909 | 94.3%
Histogram of lengths of the category
ValueCountFrequency (%)
401
 
1.9%
and234
 
1.1%
michael175
 
0.8%
david169
 
0.8%
sonder153
 
0.7%
john145
 
0.7%
nyc122
 
0.6%
alex121
 
0.6%
laura117
 
0.6%
maria105
 
0.5%
Other values (5877)19369
91.7%

Most occurring characters

ValueCountFrequency (%)
a14760
 
12.7%
e11187
 
9.7%
i9409
 
8.1%
n9394
 
8.1%
r6989
 
6.0%
l5914
 
5.1%
o4954
 
4.3%
t3638
 
3.1%
s3533
 
3.0%
h3518
 
3.0%
Other values (128)42579
36.7%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 91499 | 79.0%
Uppercase Letter | 21211 | 18.3%
Space Separator | 2158 | 1.9%
Other Punctuation | 579 | 0.5%
Open Punctuation | 134 | 0.1%
Close Punctuation | 134 | 0.1%
Dash Punctuation | 76 | 0.1%
Other Letter | 39 | < 0.1%
Decimal Number | 30 | < 0.1%
Math Symbol | 14 | < 0.1%

Most frequent character per category

ValueCountFrequency (%)
a14760
16.1%
e11187
12.2%
i9409
10.3%
n9394
10.3%
r6989
 
7.6%
l5914
 
6.5%
o4954
 
5.4%
t3638
 
4.0%
s3533
 
3.9%
h3518
 
3.8%
Other values (46)18203
19.9%
ValueCountFrequency (%)
A2475
11.7%
J2094
 
9.9%
M2051
 
9.7%
S1821
 
8.6%
C1432
 
6.8%
L1124
 
5.3%
D1079
 
5.1%
K1024
 
4.8%
R989
 
4.7%
E937
 
4.4%
Other values (22)6185
29.2%
ValueCountFrequency (%)
3
 
7.7%
3
 
7.7%
3
 
7.7%
3
 
7.7%
2
 
5.1%
2
 
5.1%
2
 
5.1%
2
 
5.1%
1
 
2.6%
1
 
2.6%
Other values (17)17
43.6%
ValueCountFrequency (%)
&415
71.7%
.122
 
21.1%
/14
 
2.4%
,13
 
2.2%
'7
 
1.2%
!4
 
0.7%
@3
 
0.5%
:1
 
0.2%
ValueCountFrequency (%)
57
23.3%
05
16.7%
75
16.7%
14
13.3%
23
10.0%
43
10.0%
62
 
6.7%
31
 
3.3%
ValueCountFrequency (%)
2154
99.8%
4
 
0.2%
ValueCountFrequency (%)
(134
100.0%
ValueCountFrequency (%)
)134
100.0%
ValueCountFrequency (%)
-76
100.0%
ValueCountFrequency (%)
+14
100.0%
ValueCountFrequency (%)
£1
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 112687 | 97.2%
Common | 3126 | 2.7%
Han | 34 | < 0.1%
Cyrillic | 23 | < 0.1%
Hiragana | 3 | < 0.1%
Hangul | 2 | < 0.1%

Most frequent character per script

ValueCountFrequency (%)
a14760
 
13.1%
e11187
 
9.9%
i9409
 
8.3%
n9394
 
8.3%
r6989
 
6.2%
l5914
 
5.2%
o4954
 
4.4%
t3638
 
3.2%
s3533
 
3.1%
h3518
 
3.1%
Other values (63)39391
35.0%
ValueCountFrequency (%)
2154
68.9%
&415
 
13.3%
(134
 
4.3%
)134
 
4.3%
.122
 
3.9%
-76
 
2.4%
+14
 
0.4%
/14
 
0.4%
,13
 
0.4%
57
 
0.2%
Other values (13)43
 
1.4%
ValueCountFrequency (%)
3
 
8.8%
3
 
8.8%
3
 
8.8%
3
 
8.8%
2
 
5.9%
2
 
5.9%
2
 
5.9%
2
 
5.9%
1
 
2.9%
1
 
2.9%
Other values (12)12
35.3%
ValueCountFrequency (%)
А3
13.0%
е3
13.0%
н2
 
8.7%
й2
 
8.7%
и2
 
8.7%
л2
 
8.7%
д1
 
4.3%
р1
 
4.3%
т1
 
4.3%
а1
 
4.3%
Other values (5)5
21.7%
ValueCountFrequency (%)
1
33.3%
1
33.3%
1
33.3%
ValueCountFrequency (%)
1
50.0%
1
50.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 115704 | 99.9%
None | 105 | 0.1%
CJK | 34 | < 0.1%
Cyrillic | 23 | < 0.1%
Punctuation | 4 | < 0.1%
Hiragana | 3 | < 0.1%
Hangul | 2 | < 0.1%

Most frequent character per block

ValueCountFrequency (%)
a14760
 
12.8%
e11187
 
9.7%
i9409
 
8.1%
n9394
 
8.1%
r6989
 
6.0%
l5914
 
5.1%
o4954
 
4.3%
t3638
 
3.1%
s3533
 
3.1%
h3518
 
3.0%
Other values (63)42408
36.7%
ValueCountFrequency (%)
é42
40.0%
á10
 
9.5%
í9
 
8.6%
ë8
 
7.6%
ô6
 
5.7%
è6
 
5.7%
ú5
 
4.8%
ó2
 
1.9%
ç2
 
1.9%
ı2
 
1.9%
Other values (12)13
 
12.4%
ValueCountFrequency (%)
3
 
8.8%
3
 
8.8%
3
 
8.8%
3
 
8.8%
2
 
5.9%
2
 
5.9%
2
 
5.9%
2
 
5.9%
1
 
2.9%
1
 
2.9%
Other values (12)12
35.3%
ValueCountFrequency (%)
4
100.0%
ValueCountFrequency (%)
А3
13.0%
е3
13.0%
н2
 
8.7%
й2
 
8.7%
и2
 
8.7%
л2
 
8.7%
д1
 
4.3%
р1
 
4.3%
т1
 
4.3%
а1
 
4.3%
Other values (5)5
21.7%
ValueCountFrequency (%)
1
33.3%
1
33.3%
1
33.3%
ValueCountFrequency (%)
1
50.0%
1
50.0%

neighbourhood_group
Categorical

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 148.6 KiB
Brooklyn: 8046
Manhattan: 8031
Queens: 2331
Bronx: 434
Staten Island: 159

Length

Max length: 13
Median length: 8
Mean length: 8.150623651
Min length: 5

Characters and Unicode

Total characters: 154870
Distinct characters: 20
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Queens
2nd row: Manhattan
3rd row: Manhattan
4th row: Queens
5th row: Brooklyn
Value | Count | Frequency (%)
Brooklyn | 8046 | 42.3%
Manhattan | 8031 | 42.3%
Queens | 2331 | 12.3%
Bronx | 434 | 2.3%
Staten Island | 159 | 0.8%
Histogram of lengths of the category
Value | Count | Frequency (%)
brooklyn | 8046 | 42.0%
manhattan | 8031 | 41.9%
queens | 2331 | 12.2%
bronx | 434 | 2.3%
island | 159 | 0.8%
staten | 159 | 0.8%

Most occurring characters

ValueCountFrequency (%)
n27191
17.6%
a24411
15.8%
o16526
10.7%
t16380
10.6%
B8480
 
5.5%
r8480
 
5.5%
l8205
 
5.3%
k8046
 
5.2%
y8046
 
5.2%
M8031
 
5.2%
Other values (10)21074
13.6%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 135551 | 87.5%
Uppercase Letter | 19160 | 12.4%
Space Separator | 159 | 0.1%

Most frequent character per category

ValueCountFrequency (%)
n27191
20.1%
a24411
18.0%
o16526
12.2%
t16380
12.1%
r8480
 
6.3%
l8205
 
6.1%
k8046
 
5.9%
y8046
 
5.9%
h8031
 
5.9%
e4821
 
3.6%
Other values (4)5414
 
4.0%
ValueCountFrequency (%)
B8480
44.3%
M8031
41.9%
Q2331
 
12.2%
S159
 
0.8%
I159
 
0.8%
ValueCountFrequency (%)
159
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 154711 | 99.9%
Common | 159 | 0.1%

Most frequent character per script

ValueCountFrequency (%)
n27191
17.6%
a24411
15.8%
o16526
10.7%
t16380
10.6%
B8480
 
5.5%
r8480
 
5.5%
l8205
 
5.3%
k8046
 
5.2%
y8046
 
5.2%
M8031
 
5.2%
Other values (9)20915
13.5%
ValueCountFrequency (%)
159
100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 154870 | 100.0%

Most frequent character per block

ValueCountFrequency (%)
n27191
17.6%
a24411
15.8%
o16526
10.7%
t16380
10.6%
B8480
 
5.5%
r8480
 
5.5%
l8205
 
5.3%
k8046
 
5.2%
y8046
 
5.2%
M8031
 
5.2%
Other values (10)21074
13.6%

neighbourhood
Categorical

HIGH CARDINALITY

Distinct: 215
Distinct (%): 1.1%
Missing: 0
Missing (%): 0.0%
Memory size: 148.6 KiB
Williamsburg: 1526
Bedford-Stuyvesant: 1478
Harlem: 1086
Bushwick: 978
Upper West Side: 734
Other values (210): 13199

Length

Max length: 26
Median length: 12
Mean length: 11.91947792
Min length: 4

Characters and Unicode

Total characters: 226482
Distinct characters: 54
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 19
Unique (%): 0.1%

Sample

1st row: Sunnyside
2nd row: Hell's Kitchen
3rd row: West Village
4th row: Astoria
5th row: Bushwick
Value | Count | Frequency (%)
Williamsburg | 1526 | 8.0%
Bedford-Stuyvesant | 1478 | 7.8%
Harlem | 1086 | 5.7%
Bushwick | 978 | 5.1%
Upper West Side | 734 | 3.9%
East Village | 705 | 3.7%
Hell's Kitchen | 693 | 3.6%
Upper East Side | 667 | 3.5%
Crown Heights | 631 | 3.3%
Midtown | 505 | 2.7%
Other values (205) | 9998 | 52.6%
Histogram of lengths of the category
Value | Count | Frequency (%)
east | 2575 | 8.4%
side | 1756 | 5.7%
harlem | 1540 | 5.0%
williamsburg | 1526 | 5.0%
bedford-stuyvesant | 1478 | 4.8%
heights | 1427 | 4.7%
upper | 1401 | 4.6%
village | 1186 | 3.9%
west | 1033 | 3.4%
bushwick | 978 | 3.2%
Other values (227) | 15760 | 51.4%

Most occurring characters

ValueCountFrequency (%)
e20714
 
9.1%
i16226
 
7.2%
s15652
 
6.9%
t15047
 
6.6%
a14816
 
6.5%
l13284
 
5.9%
r13260
 
5.9%
11659
 
5.1%
n10202
 
4.5%
o9383
 
4.1%
Other values (44)86239
38.1%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 179812 | 79.4%
Uppercase Letter | 32558 | 14.4%
Space Separator | 11659 | 5.1%
Dash Punctuation | 1707 | 0.8%
Other Punctuation | 746 | 0.3%

Most frequent character per category

ValueCountFrequency (%)
e20714
11.5%
i16226
 
9.0%
s15652
 
8.7%
t15047
 
8.4%
a14816
 
8.2%
l13284
 
7.4%
r13260
 
7.4%
n10202
 
5.7%
o9383
 
5.2%
d7624
 
4.2%
Other values (15)43604
24.2%
ValueCountFrequency (%)
H4667
14.3%
S4437
13.6%
B3298
10.1%
W3154
9.7%
E2772
8.5%
C2107
 
6.5%
G1464
 
4.5%
U1429
 
4.4%
F1324
 
4.1%
V1208
 
3.7%
Other values (14)6698
20.6%
ValueCountFrequency (%)
'697
93.4%
.48
 
6.4%
,1
 
0.1%
ValueCountFrequency (%)
11659
100.0%
ValueCountFrequency (%)
-1707
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 212370 | 93.8%
Common | 14112 | 6.2%

Most frequent character per script

ValueCountFrequency (%)
e20714
 
9.8%
i16226
 
7.6%
s15652
 
7.4%
t15047
 
7.1%
a14816
 
7.0%
l13284
 
6.3%
r13260
 
6.2%
n10202
 
4.8%
o9383
 
4.4%
d7624
 
3.6%
Other values (39)76162
35.9%
ValueCountFrequency (%)
11659
82.6%
-1707
 
12.1%
'697
 
4.9%
.48
 
0.3%
,1
 
< 0.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 226482 | 100.0%

Most frequent character per block

ValueCountFrequency (%)
e20714
 
9.1%
i16226
 
7.2%
s15652
 
6.9%
t15047
 
6.6%
a14816
 
6.5%
l13284
 
5.9%
r13260
 
5.9%
11659
 
5.1%
n10202
 
4.5%
o9383
 
4.1%
Other values (44)86239
38.1%

latitude
Real number (ℝ≥0)

Distinct: 12087
Distinct (%): 63.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 40.72806312
Minimum: 40.50873
Maximum: 40.91306
Zeros: 0
Zeros (%): 0.0%
Memory size: 148.6 KiB

Quantile statistics

Minimum: 40.50873
5th percentile: 40.64456
Q1: 40.68882
Median: 40.72171
Q3: 40.76321
95th percentile: 40.82645
Maximum: 40.91306
Range: 0.40433
Interquartile range (IQR): 0.07439

Descriptive statistics

Standard deviation: 0.05538931047
Coefficient of variation (CV): 0.001359978998
Kurtosis: 0.05864487992
Mean: 40.72806312
Median Absolute Deviation (MAD): 0.03659
Skewness: 0.2542518047
Sum: 773873.9274
Variance: 0.003067975714
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
40.72232 | 8 | < 0.1%
40.69414 | 8 | < 0.1%
40.68634 | 8 | < 0.1%
40.71813 | 8 | < 0.1%
40.72607 | 7 | < 0.1%
40.68084 | 7 | < 0.1%
40.68683 | 7 | < 0.1%
40.70741 | 6 | < 0.1%
40.72522 | 6 | < 0.1%
40.72276 | 6 | < 0.1%
Other values (12077) | 18930 | 99.6%

Value | Count | Frequency (%)
40.50873 | 1 | < 0.1%
40.52293 | 1 | < 0.1%
40.53871 | 1 | < 0.1%
40.53939 | 1 | < 0.1%
40.54106 | 1 | < 0.1%

Value | Count | Frequency (%)
40.91306 | 1 | < 0.1%
40.90527 | 1 | < 0.1%
40.90391 | 1 | < 0.1%
40.90356 | 1 | < 0.1%
40.90329 | 1 | < 0.1%

longitude
Real number (ℝ)

Distinct: 9944
Distinct (%): 52.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -73.95082679
Minimum: -74.23914
Maximum: -73.71795
Zeros: 0
Zeros (%): 0.0%
Memory size: 148.6 KiB

Quantile statistics

Minimum: -74.23914
5th percentile: -74.00369
Q1: -73.98205
Median: -73.95463
Q3: -73.93449
95th percentile: -73.86186
Maximum: -73.71795
Range: 0.52119
Interquartile range (IQR): 0.04756

Descriptive statistics

Standard deviation: 0.04682498719
Coefficient of variation (CV): -0.0006331908543
Kurtosis: 4.824279693
Mean: -73.95082679
Median Absolute Deviation (MAD): 0.02489
Skewness: 1.234020827
Sum: -1405139.66
Variance: 0.002192579425
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
-73.95742 | 9 | < 0.1%
-73.95427 | 9 | < 0.1%
-73.95121 | 9 | < 0.1%
-73.95725 | 9 | < 0.1%
-73.98589 | 9 | < 0.1%
-73.94829 | 9 | < 0.1%
-73.95677 | 8 | < 0.1%
-73.94524 | 8 | < 0.1%
-73.9398 | 8 | < 0.1%
-73.95149 | 8 | < 0.1%
Other values (9934) | 18915 | 99.5%

Value | Count | Frequency (%)
-74.23914 | 1 | < 0.1%
-74.21238 | 1 | < 0.1%
-74.19626 | 1 | < 0.1%
-74.18259 | 1 | < 0.1%
-74.17628 | 1 | < 0.1%

Value | Count | Frequency (%)
-73.71795 | 1 | < 0.1%
-73.71829 | 1 | < 0.1%
-73.72582 | 1 | < 0.1%
-73.72716 | 1 | < 0.1%
-73.72731 | 1 | < 0.1%

room_type
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 148.6 KiB
Entire home/apt: 9522
Private room: 9041
Shared room: 438

Length

Max length: 15
Median length: 15
Mean length: 13.48034314
Min length: 11

Characters and Unicode

Total characters: 256140
Distinct characters: 17
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Private room
2nd row: Entire home/apt
3rd row: Entire home/apt
4th row: Entire home/apt
5th row: Private room
Value | Count | Frequency (%)
Entire home/apt | 9522 | 50.1%
Private room | 9041 | 47.6%
Shared room | 438 | 2.3%
Histogram of lengths of the category
Value | Count | Frequency (%)
entire | 9522 | 25.1%
home/apt | 9522 | 25.1%
room | 9479 | 24.9%
private | 9041 | 23.8%
shared | 438 | 1.2%

Most occurring characters

ValueCountFrequency (%)
e28523
11.1%
r28480
11.1%
o28480
11.1%
t28085
11.0%
a19001
 
7.4%
19001
 
7.4%
m19001
 
7.4%
i18563
 
7.2%
h9960
 
3.9%
E9522
 
3.7%
Other values (7)47524
18.6%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 208616 | 81.4%
Uppercase Letter | 19001 | 7.4%
Space Separator | 19001 | 7.4%
Other Punctuation | 9522 | 3.7%

Most frequent character per category

ValueCountFrequency (%)
e28523
13.7%
r28480
13.7%
o28480
13.7%
t28085
13.5%
a19001
9.1%
m19001
9.1%
i18563
8.9%
h9960
 
4.8%
n9522
 
4.6%
p9522
 
4.6%
Other values (2)9479
 
4.5%
ValueCountFrequency (%)
E9522
50.1%
P9041
47.6%
S438
 
2.3%
ValueCountFrequency (%)
19001
100.0%
ValueCountFrequency (%)
/9522
100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 227617 | 88.9%
Common | 28523 | 11.1%
11.1%

Most frequent character per script

ValueCountFrequency (%)
e28523
12.5%
r28480
12.5%
o28480
12.5%
t28085
12.3%
a19001
8.3%
m19001
8.3%
i18563
8.2%
h9960
 
4.4%
E9522
 
4.2%
n9522
 
4.2%
Other values (5)28480
12.5%
ValueCountFrequency (%)
19001
66.6%
/9522
33.4%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 256140 | 100.0%

Most frequent character per block

ValueCountFrequency (%)
e28523
11.1%
r28480
11.1%
o28480
11.1%
t28085
11.0%
a19001
 
7.4%
19001
 
7.4%
m19001
 
7.4%
i18563
 
7.2%
h9960
 
3.9%
E9522
 
3.7%
Other values (7)47524
18.6%

price
Real number (ℝ≥0)

Distinct: 321
Distinct (%): 1.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 122.3404558
Minimum: 10
Maximum: 350
Zeros: 0
Zeros (%): 0.0%
Memory size: 148.6 KiB

Quantile statistics

Minimum: 10
5th percentile: 40
Q1: 66
Median: 100
Q3: 160
95th percentile: 270
Maximum: 350
Range: 340
Interquartile range (IQR): 94
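
The range and interquartile range follow directly from the quantiles above. The sketch below recomputes them for price and also derives Tukey's 1.5 × IQR fences, a common outlier convention that the report itself does not mention; it is included only as an illustration.

```python
def spread(minimum, q1, q3, maximum):
    """Range, IQR and Tukey fences from four summary quantiles."""
    iqr = q3 - q1
    return {
        "range": maximum - minimum,
        "iqr": iqr,
        # Tukey's rule of thumb (an added convention, not from the report).
        "lower_fence": q1 - 1.5 * iqr,
        "upper_fence": q3 + 1.5 * iqr,
    }


# Quantiles reported for price above.
price = spread(minimum=10, q1=66, q3=160, maximum=350)
print(price["range"], price["iqr"])  # 340 94
print(price["upper_fence"])          # 301.0
```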

Descriptive statistics

Standard deviation: 71.53034564
Coefficient of variation (CV): 0.5846826807
Kurtosis: 0.5028006941
Mean: 122.3404558
Median Absolute Deviation (MAD): 45
Skewness: 1.027024411
Sum: 2324591
Variance: 5116.590347
Monotonicity: Not monotonic
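
The median absolute deviation (MAD) is the median of the absolute distances from the median. The following sketch shows the standard definition on a small made-up list; the values are toy data, not the price column, so the result is not expected to match the reported MAD of 45.

```python
from statistics import median


def median_absolute_deviation(values):
    """Median of the absolute deviations from the median."""
    center = median(values)
    return median([abs(v - center) for v in values])


# Toy values for illustration only.
toy_prices = [50, 75, 100, 150, 200]
mad = median_absolute_deviation(toy_prices)
print(mad)  # 50
```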
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
100 | 856 | 4.5%
150 | 821 | 4.3%
50 | 636 | 3.3%
200 | 590 | 3.1%
75 | 570 | 3.0%
60 | 555 | 2.9%
80 | 531 | 2.8%
70 | 482 | 2.5%
120 | 471 | 2.5%
65 | 471 | 2.5%
Other values (311) | 13018 | 68.5%

Value | Count | Frequency (%)
10 | 6 | < 0.1%
11 | 2 | < 0.1%
12 | 1 | < 0.1%
13 | 1 | < 0.1%
15 | 1 | < 0.1%

Value | Count | Frequency (%)
350 | 147 | 0.8%
349 | 14 | 0.1%
348 | 1 | < 0.1%
347 | 2 | < 0.1%
346 | 1 | < 0.1%

minimum_nights
Real number (ℝ≥0)

ZEROS

Distinct: 75
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.121219818
Minimum: 0
Maximum: 7.13089883
Zeros: 5003
Zeros (%): 26.3%
Memory size: 148.6 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0
Median: 0.6931471806
Q3: 1.609437912
95th percentile: 3.401197382
Maximum: 7.13089883
Range: 7.13089883
Interquartile range (IQR): 1.609437912

Descriptive statistics

Standard deviation: 1.064894555
Coefficient of variation (CV): 0.9497642994
Kurtosis: 0.9149991731
Mean: 1.121219818
Median Absolute Deviation (MAD): 0.6931471806
Skewness: 1.128676634
Sum: 21304.29777
Variance: 1.134000414
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 5003 | 26.3%
0.6931471806 | 4619 | 24.3%
1.098612289 | 3086 | 16.2%
3.401197382 | 1442 | 7.6%
1.386294361 | 1253 | 6.6%
1.609437912 | 1140 | 6.0%
1.945910149 | 822 | 4.3%
1.791759469 | 285 | 1.5%
2.63905733 | 210 | 1.1%
2.302585093 | 178 | 0.9%
Other values (65) | 963 | 5.1%

Value | Count | Frequency (%)
0 | 5003 | 26.3%
0.6931471806 | 4619 | 24.3%
1.098612289 | 3086 | 16.2%
1.386294361 | 1253 | 6.6%
1.609437912 | 1140 | 6.0%

Value | Count | Frequency (%)
7.13089883 | 1 | < 0.1%
6.906754779 | 2 | < 0.1%
6.173786104 | 1 | < 0.1%
5.991464547 | 1 | < 0.1%
5.913503006 | 1 | < 0.1%
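
The values in these tables look like natural logarithms of whole numbers (0.6931471806 is ln 2, 1.098612289 is ln 3, 3.401197382 is ln 30). Assuming the column was log-transformed before profiling, which the report does not state, exponentiating recovers night counts, and the zeros flagged in the warning would then correspond to a one-night minimum rather than to zero nights. A quick check of that assumption:

```python
import math

# Values copied from the frequency tables above.
reported = [0, 0.6931471806, 1.098612289, 1.386294361,
            1.609437912, 3.401197382, 7.13089883]

# Assumption: the column holds ln(nights); invert with exp and round.
nights = [round(math.exp(v)) for v in reported]
print(nights)  # [1, 2, 3, 4, 5, 30, 1250]

# How far each back-transformed value is from a whole number.
residuals = [abs(math.exp(v) - n) for v, n in zip(reported, nights)]
print(max(residuals) < 1e-4)  # True
```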

number_of_reviews
Real number (ℝ≥0)

ZEROS

Distinct: 321
Distinct (%): 1.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 23.79774749
Minimum: 0
Maximum: 607
Zeros: 3758
Zeros (%): 19.8%
Memory size: 148.6 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 1
Median: 6
Q3: 24
95th percentile: 116
Maximum: 607
Range: 607
Interquartile range (IQR): 23

Descriptive statistics

Standard deviation: 45.49345456
Coefficient of variation (CV): 1.911670614
Kurtosis: 19.56592146
Mean: 23.79774749
Median Absolute Deviation (MAD): 6
Skewness: 3.706018813
Sum: 452181
Variance: 2069.654408
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 3758 | 19.8%
1 | 2022 | 10.6%
2 | 1342 | 7.1%
3 | 975 | 5.1%
4 | 796 | 4.2%
5 | 589 | 3.1%
6 | 544 | 2.9%
7 | 495 | 2.6%
8 | 449 | 2.4%
9 | 383 | 2.0%
Other values (311) | 7648 | 40.3%

Value | Count | Frequency (%)
0 | 3758 | 19.8%
1 | 2022 | 10.6%
2 | 1342 | 7.1%
3 | 975 | 5.1%
4 | 796 | 4.2%

Value | Count | Frequency (%)
607 | 1 | < 0.1%
594 | 1 | < 0.1%
510 | 1 | < 0.1%
488 | 1 | < 0.1%
474 | 1 | < 0.1%

last_review
Date

MISSING

Distinct: 1494
Distinct (%): 9.8%
Missing: 3758
Missing (%): 19.8%
Memory size: 148.6 KiB
Minimum: 2011-05-12 00:00:00
Maximum: 2019-07-08 00:00:00
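
The Distinct (%) figure here appears to be computed over non-missing rows rather than over all 19001 observations: 1494 distinct dates out of 15243 non-missing values gives 9.8%, whereas dividing by all rows would give about 7.9%. The sketch below checks that reading against the reported numbers; the function name is illustrative and the denominator is inferred from the figures, not confirmed by the report.

```python
def distinct_pct(distinct: int, n_rows: int, missing: int) -> float:
    """Distinct values as a percentage of the non-missing rows."""
    return 100.0 * distinct / (n_rows - missing)


N_OBS = 19001

# last_review: 1494 distinct, 3758 missing.
print(f"{distinct_pct(1494, N_OBS, 3758):.1f}%")  # 9.8%

# name: 18780 distinct, 7 missing.
print(f"{distinct_pct(18780, N_OBS, 7):.1f}%")    # 98.9%
```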
Histogram with fixed size bins (bins=50)

reviews_per_month
Real number (ℝ≥0)

MISSING

Distinct: 789
Distinct (%): 5.2%
Missing: 3758
Missing (%): 19.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.380927639
Minimum: 0.01
Maximum: 27.95
Zeros: 0
Zeros (%): 0.0%
Memory size: 148.6 KiB

Quantile statistics

Minimum: 0.01
5th percentile: 0.04
Q1: 0.19
Median: 0.72
Q3: 2.01
95th percentile: 4.69
Maximum: 27.95
Range: 27.94
Interquartile range (IQR): 1.82

Descriptive statistics

Standard deviation: 1.689988413
Coefficient of variation (CV): 1.22380664
Kurtosis: 11.9516004
Mean: 1.380927639
Median Absolute Deviation (MAD): 0.62
Skewness: 2.435330583
Sum: 21049.48
Variance: 2.856060837
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value               Count   Frequency (%)
0.02                362     1.9%
1                   352     1.9%
0.05                333     1.8%
0.03                323     1.7%
0.04                268     1.4%
0.08                256     1.3%
0.16                244     1.3%
0.09                235     1.2%
0.06                226     1.2%
0.11                217     1.1%
Other values (779)  12427   65.4%
(Missing)           3758    19.8%

Minimum 5 values
Value               Count   Frequency (%)
0.01                17      0.1%
0.02                362     1.9%
0.03                323     1.7%
0.04                268     1.4%
0.05                333     1.8%

Maximum 5 values
Value               Count   Frequency (%)
27.95               1       < 0.1%
20.94               1       < 0.1%
19.75               1       < 0.1%
17.82               1       < 0.1%
16.22               1       < 0.1%
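
The frequency tables above list the ten most common values, fold the remaining distinct values into an "Other values (n)" row, and report missing cells separately, with percentages taken over all observations. The sketch below shows one way such a table could be assembled. It is an illustration, not the profiling tool's own code: the function name `frequency_table` and the use of `None` for missing cells are assumptions.

```python
from collections import Counter


def frequency_table(values, n_rows=10):
    """Build (label, count, percent) rows: top values, 'Other values', '(Missing)'.

    Missing cells are assumed to be represented as None. Percentages are
    relative to the total number of observations, missing ones included.
    """
    total = len(values)
    present = [v for v in values if v is not None]
    counts = Counter(present)
    top = counts.most_common(n_rows)

    rows = [(value, count, 100 * count / total) for value, count in top]

    # Everything outside the top rows is folded into one summary row.
    n_other = len(counts) - len(top)
    if n_other > 0:
        other_count = len(present) - sum(count for _, count in top)
        rows.append((f"Other values ({n_other})", other_count,
                     100 * other_count / total))

    n_missing = total - len(present)
    if n_missing:
        rows.append(("(Missing)", n_missing, 100 * n_missing / total))
    return rows


# Toy data, not taken from the dataset.
sample = [0.02, 0.02, 0.02, 1.0, 1.0, 0.05, None, None]
for label, count, pct in frequency_table(sample, n_rows=2):
    print(label, count, f"{pct:.1f}%")
```

With 789 distinct values and ten rows shown, the same bookkeeping gives the 779 "other" values reported for reviews_per_month.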

calculated_host_listings_count
Real number (ℝ≥0)

Distinct      47
Distinct (%)  0.2%
Missing       0
Missing (%)   0.0%
Infinite      0
Infinite (%)  0.0%
Mean          6.583811378
Minimum       1
Maximum       327
Zeros         0
Zeros (%)     0.0%
Memory size   148.6 KiB

Quantile statistics

Minimum                    1
5-th percentile            1
Q1                         1
median                     1
Q3                         2
95-th percentile           13
Maximum                    327
Range                      326
Interquartile range (IQR)  1

Descriptive statistics

Standard deviation               31.15475023
Coefficient of variation (CV)    4.732023511
Kurtosis                         77.21481439
Mean                             6.583811378
Median Absolute Deviation (MAD)  0
Skewness                         8.461222905
Sum                              125099
Variance                         970.6184621
Monotonicity                     Not monotonic
Histogram with fixed size bins (bins=47)
Common values
Value              Count   Frequency (%)
1                  12627   66.5%
2                  2605    13.7%
3                  1123    5.9%
4                  539     2.8%
5                  345     1.8%
6                  208     1.1%
7                  165     0.9%
8                  164     0.9%
327                117     0.6%
9                  98      0.5%
Other values (37)  1010    5.3%

Minimum 5 values
Value              Count   Frequency (%)
1                  12627   66.5%
2                  2605    13.7%
3                  1123    5.9%
4                  539     2.8%
5                  345     1.8%

Maximum 5 values
Value              Count   Frequency (%)
327                117     0.6%
232                71      0.4%
121                44      0.2%
103                36      0.2%
96                 69      0.4%

availability_365
Real number (ℝ≥0)

ZEROS

Distinct      366
Distinct (%)  1.9%
Missing       0
Missing (%)   0.0%
Infinite      0
Infinite (%)  0.0%
Mean          109.7253829
Minimum       0
Maximum       365
Zeros         6970
Zeros (%)     36.7%
Memory size   148.6 KiB

Quantile statistics

Minimum                    0
5-th percentile            0
Q1                         0
median                     39
Q3                         219
95-th percentile           358
Maximum                    365
Range                      365
Interquartile range (IQR)  219

Descriptive statistics

Standard deviation               130.5998995
Coefficient of variation (CV)    1.190243279
Kurtosis                         -0.9348291358
Mean                             109.7253829
Median Absolute Deviation (MAD)  39
Skewness                         0.8027018468
Sum                              2084892
Variance                         17056.33374
Monotonicity                     Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value               Count   Frequency (%)
0                   6970    36.7%
365                 432     2.3%
364                 195     1.0%
1                   176     0.9%
5                   133     0.7%
3                   115     0.6%
2                   113     0.6%
179                 112     0.6%
89                  111     0.6%
6                   110     0.6%
Other values (356)  10534   55.4%

Minimum 5 values
Value               Count   Frequency (%)
0                   6970    36.7%
1                   176     0.9%
2                   113     0.6%
3                   115     0.6%
4                   107     0.6%

Maximum 5 values
Value               Count   Frequency (%)
365                 432     2.3%
364                 195     1.0%
363                 88      0.5%
362                 62      0.3%
361                 44      0.2%

Interactions


Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
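
The calculation described above can be sketched in a few lines of standard-library Python. This is an illustration on toy data, not the code the report was generated with; the function name `pearson_r` is an assumption.

```python
import math


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations.

    The 1/n factors of the covariance and the standard deviations cancel,
    so plain sums of deviations are enough.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Toy data, not taken from the dataset.
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
print(round(pearson_r(x, y), 3))  # 0.8

# Shifting and rescaling x leaves r unchanged.
x_rescaled = [10 * a + 3 for a in x]
print(round(pearson_r(x_rescaled, y), 3))  # 0.8
```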

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at capturing nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
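
As a rough sketch of that recipe, the snippet below replaces each value by its rank (ties receive the average of the ranks they span, a common convention that is assumed here) and then applies the Pearson formula to the ranks. The function names and the toy data are illustrative.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the average of the ranks they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average rank of sorted positions i..j
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson_r(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(x, y):
    """Pearson correlation of the rank variables."""
    return pearson_r(average_ranks(x), average_ranks(y))


# A monotonic but nonlinear relationship: y = x ** 3.
x = [1, 2, 3, 4, 5]
y = [a ** 3 for a in x]
print(spearman_rho(x, y))      # 1.0
print(pearson_r(x, y) < 1.0)   # True: the linear measure falls short of 1
print(average_ranks([10, 20, 20, 30]))  # [1.0, 2.5, 2.5, 4.0]
```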

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one counts the concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
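
The pair-counting definition above corresponds to the variant usually called tau-a. The sketch below implements exactly that definition on toy data; the function name is illustrative. Statistical libraries commonly default to the tie-corrected tau-b instead, so values computed this way may differ from the report's heatmap when the data contain ties.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs.

    A pair (i, j) is concordant when x and y order it the same way and
    discordant when they order it oppositely. Tied pairs count as neither.
    """
    concordant = 0
    discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    return (concordant - discordant) / len(pairs)


# Toy data, not taken from the dataset: 8 concordant and 2 discordant pairs.
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
print(kendall_tau_a(x, y))  # 0.6
```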

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
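
For orientation, here is a minimal sketch of a bias-corrected Cramér's V, following the correction as it is usually stated for Bergsma (2013): the squared phi coefficient and the table dimensions are each adjusted before the ratio is taken. The function name and the toy data are illustrative, and whether the profiling tool uses exactly these steps is an assumption.

```python
import math
from collections import Counter


def cramers_v_corrected(a, b):
    """Bias-corrected Cramér's V for two equally long sequences of labels."""
    n = len(a)
    joint = Counter(zip(a, b))
    row_totals = Counter(a)
    col_totals = Counter(b)
    r, k = len(row_totals), len(col_totals)

    # Pearson chi-squared statistic of the contingency table.
    chi2 = 0.0
    for row, row_count in row_totals.items():
        for col, col_count in col_totals.items():
            expected = row_count * col_count / n
            observed = joint.get((row, col), 0)
            chi2 += (observed - expected) ** 2 / expected

    phi2 = chi2 / n
    # Bias correction of phi^2 and of the table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    if denom <= 0:
        return 0.0  # degenerate table: a single category in one variable
    return math.sqrt(phi2_corr / denom)


# Toy data, not taken from the dataset.
labels = ["x", "y"] * 10
print(round(cramers_v_corrected(labels, labels), 6))  # 1.0 (perfect association)

a = ["x", "x", "y", "y"] * 5
b = ["p", "q", "p", "q"] * 5
print(cramers_v_corrected(a, b))  # 0.0 (independent)
```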

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.
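
Nullity correlation, as shown in the heatmap, can be read as an ordinary correlation computed on missing/present indicators rather than on the values themselves. The sketch below illustrates the idea on toy columns. The function name and the use of `None` for missing cells are assumptions, and the toy rows are not taken from the dataset.

```python
import math


def nullity_correlation(col_a, col_b):
    """Pearson correlation between the 'is missing' indicators of two columns.

    Returns NaN when either column is fully present or fully missing,
    because its indicator then has zero variance.
    """
    a = [1.0 if v is None else 0.0 for v in col_a]
    b = [1.0 if v is None else 0.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    if var_a == 0 or var_b == 0:
        return float("nan")
    return cov / math.sqrt(var_a * var_b)


# Toy columns that are missing on exactly the same rows.
last_review = ["2019-05-26", None, "2018-09-19", None, "2019-06-23"]
reviews_per_month = [0.13, None, 1.12, None, 0.52]
print(round(nullity_correlation(last_review, reviews_per_month), 6))  # 1.0
```

In this report, last_review and reviews_per_month each have 3758 missing cells, which is consistent with (though it does not by itself prove) the two columns being missing on the same rows.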

Sample

First rows

df_index  id  name  host_id  host_name  neighbourhood_group  neighbourhood  latitude  longitude  room_type  price  minimum_nights  number_of_reviews  last_review  reviews_per_month  calculated_host_listings_count  availability_365
009138664Private Lg Room 15 min to Manhattan47594947IrisQueensSunnyside40.74271-73.92493Private room740.69314762019-05-260.1315
1131444015TIME SQUARE CHARMING ONE BED IN HELL'S KITCHEN,NYC8523790JohlexManhattanHell's Kitchen40.76682-73.98878Entire home/apt1701.0986120NaTNaN1188
228741020Voted #1 Location Quintessential 1BR W Village Apt45854238JohnManhattanWest Village40.73631-74.00611Entire home/apt2451.098612512018-09-191.1210
3334602077Spacious 1 bedroom apartment 15min from Manhattan261055465ReganQueensAstoria40.76424-73.92351Entire home/apt1251.09861212019-05-240.65113
4423203149Big beautiful bedroom in huge Bushwick apartment143460MeganBrooklynBushwick40.69839-73.92044Private room650.69314782019-06-230.5228
554402805LRG 2br BKLYN APT CLOSE TO TRAINS AND PARK22807362JennyBrooklynProspect-Lefferts Gardens40.66025-73.96270Entire home/apt1201.09861232018-08-280.05116
6630070126✩Prime Renovated 1/1 Apartment in Upper East Side✩4968673SeanManhattanUpper East Side40.76831-73.95929Entire home/apt2001.60943822019-05-260.68171
7734231172Fully renovated brick house floor in Brooklyn59642348KevinBrooklynSunset Park40.64550-74.01262Entire home/apt950.00000092019-07-089.001106
885856760Renovated 1BR in exciting, convenient area29408349ChadManhattanChinatown40.71490-73.99976Entire home/apt1791.60943872017-04-180.1410
997929441Beautiful Loft w/ Waterfront View!1453898AnthonyBrooklynWilliamsburg40.71268-73.96676Private room1050.6931472322019-06-195.00364

Last rows

df_index  id  name  host_id  host_name  neighbourhood_group  neighbourhood  latitude  longitude  room_type  price  minimum_nights  number_of_reviews  last_review  reviews_per_month  calculated_host_listings_count  availability_365
18991199905192459Quiet Room in 4BR UWS Brownstone10677483GregManhattanUpper West Side40.80173-73.96625Private room700.0000000NaTNaN10
18992199911327940Huge Gorgeous Park View Apartment!3290436HadarBrooklynFlatbush40.65335-73.96257Entire home/apt1201.098612132016-08-270.282327
189931999223612681Shared Room 1 Stop from Manhattan on the F Train55724558TaylorQueensLong Island City40.76006-73.94080Private room551.38629422019-06-010.65589
189941999334485745Midtown Manhattan Stunner - Private room261632622RoyaltonManhattanTheater District40.75491-73.98507Private room1000.00000032019-06-163.009318
189951999425616250Stylish, spacious, private 1BR apt in Ditmas Park125396920AdamBrooklynFlatbush40.64314-73.95705Entire home/apt751.098612102019-01-030.8410
18996199957094539Tranquil haven in bubbly Brooklyn2052211AdrianaBrooklynWindsor Terrace40.65360-73.97546Entire home/apt1432.63905722016-08-270.04110
18997199964424261Large 1 BR with backyard on UWS3447311SarahManhattanUpper West Side40.80188-73.96808Entire home/apt2000.693147222019-05-210.5010
18998199974545882Amazing studio/Loft with a backyard23569951KavehManhattanUpper East Side40.78110-73.94567Entire home/apt2201.098612282019-05-230.501293
189991999826518547U2 comfortable double bed sleeps 2 guests295128Carol GloriaBronxClason Point40.81225-73.85502Private room800.00000042019-07-011.487365
190001999933631782Private Bedroom in Williamsburg Apt!8569221AndiBrooklynWilliamsburg40.71829-73.95819Private room1091.09861232019-04-281.07297